Abstract: Although file-distribution applications are responsible
for a major portion of current Internet traffic, little effort
has so far been dedicated to studying file distribution from the
point of view of energy efficiency. In this paper, we present
a first approach to the problem of energy efficiency for file
distribution. Specifically, we first demonstrate that the general
problem of minimizing energy consumption in file distribution
in heterogeneous settings is NP-hard. For homogeneous settings,
we derive tight lower bounds on energy consumption, and we
design a family of algorithms that achieve these bounds. Our
results prove that collaborative p2p schemes achieve up to 50%
energy savings with respect to the best available centralized file
distribution scheme. Through simulation, we validate this
observation in more realistic cases (e.g., considering network
congestion and link variability across hosts): our collaborative
algorithms always achieve significant energy savings with respect
to centralized file distribution systems.

I. Introduction 

The need to reduce the carbon footprint of all human
activities while satisfying an ever-growing energy demand
has triggered interest in the design of novel energy-efficient
solutions in several domains. Specifically, recent
studies reveal that the ICT (Information and Communications
Technologies) sector is becoming a major contributor to
worldwide energy consumption, comparable to the aviation
sector. Furthermore, the energy consumption of the ICT
sector is expected to double in the next decade [1], unless
new mechanisms and solutions are implemented. This situation
has motivated the research community to investigate novel
mechanisms and solutions for saving energy in ICT, to be
deployed by telecommunication network operators, Internet
Service Providers (ISPs), content providers, and datacenter
owners [3]-[6]. The proposed approaches in the field of
energy-efficient networking, at either the device level (e.g., new
hardware design [7]) or the system level (energy-efficient routing
[8], [9] or sleep modes in wired and wireless networks [10],
[11]), aim to achieve an "energy proportional" network, that is,
to make the energy consumed by the network proportional to
its traffic load. Hosts (servers and user terminals)
are responsible for the major portion of the whole Internet
power consumption. Current energy-efficient strategies in
this domain aim at making the energy consumed proportional
to the level of CPU or network activity of hosts, and often
imply switching devices off, or to a low-power mode, when
they are not active. However, energy proportionality of hardware
does not suffice to define a complete energy-efficient framework
for hosts. Indeed, new solutions must be found that implement
energy-efficient services (e.g., file sharing, web browsing, etc.)
to optimize the utilization of hosts and network resources.

In this paper, we focus on the file distribution service,
which is one of the most widespread services on the Internet.
Indeed, some of the existing file distribution services, such as
peer-to-peer (p2p), one-click-hosting (OCH), software release,
etc., represent a major fraction of current Internet traffic
[12]-[14]. Despite the importance of these services, to
the best of the authors' knowledge, little effort has been
dedicated to understanding and achieving energy efficiency in
the context of file distribution applications. In addition, within
the context of corporate/LAN networks, other operations such
as software updates are also file distribution processes. All this
makes it essential to investigate energy efficiency in file
distribution in depth, in order to achieve a truly Green Internet.

This paper is a first step in this direction. Our aim is to
define the analytical and algorithmic basis for the design of
energy-efficient file distribution protocols. For this purpose,
we first prove that the general problem of minimizing energy
consumption in a file distribution process is NP-hard. Hence,
we analytically study restricted versions of the problem, while
maintaining a balance between simplicity and applicability
in real scenarios. Our analysis defines lower bounds and
proposes collaborative p2p optimal (and near-optimal)
algorithms for reducing energy consumption in the studied file
distribution scenarios. Afterwards, we present an empirical
evaluation through simulation that allows us to (i) validate our
analytical results and (ii) relax several assumptions imposed
in the analytical study. Simulations show that, even in more
realistic cases (considering energy costs associated with on-off
state transitions or network congestion), our collaborative
p2p schemes achieve significant energy savings with respect
to centralized file distribution systems. These savings range
between 50% and two orders of magnitude, depending on the
centralized scheme under consideration.

In summary, the main contributions of this paper are the 
following: 

• We prove that the general problem of minimizing energy 
consumption in a file distribution process is NP-hard. 

• We derive lower bounds for the energy consumed in a 
file distribution process for simple yet realistic scenarios. 

• We design algorithms that achieve optimal (or
near-optimal) energy consumption for these simple scenarios.

• We demonstrate that the proposed collaborative p2p
scheme is an appropriate approach to reduce the energy
consumption in a file distribution process, showing
energy savings of at least 50% with respect to
any centralized file distribution scheme in the studied
scenarios.

• We perform an empirical simulation study that validates
all the previous statements and quantifies the energy
savings achievable with our algorithms on a representative
set of scenarios.

The rest of the paper is structured as follows. Section II
provides the network and energy model along with the
definitions and terminology used throughout the paper. Section III
presents the theoretical results obtained, in the form of bounds
and file distribution schemes. In Section IV we present
our simulation study. Section V reviews the related work and
Section VI concludes the paper.



II. System Model, Problem Definition and Assumptions



A. System Model and Assumptions 

We consider a system of $n + 1$ hosts ($n \geq 1$) that are
fully connected via a wired network. One of these hosts,
called the server and denoted by $S$, initially has a file of
size $B$ that it has to distribute to all the other hosts, which
we call the clients. We assume that the file is divided into
$\beta \geq 1$ blocks of equal size $s = B/\beta$. The set of hosts is
denoted as $\mathcal{H} = \{S, H_0, H_1, \ldots, H_{n-1}\}$ and the set of
blocks as $\mathcal{B} = \{b_0, b_1, \ldots, b_{\beta-1}\}$. We will also use
in this paper a set of indexes, defined as $I = \{S, 0, \ldots, n-1\}$.
For simplicity of notation and presentation, we will often use an
index $i \in I$ to denote a host, and even talk about host $i$
instead of host $H_i$ (or $S$ when $i = S$).

All the hosts in $\mathcal{H}$ can potentially upload blocks of the file
to other hosts (initially only $S$ can do so). A client can start
uploading block $b_i$ only if it has received $b_i$ completely. Hosts
have upload capacity $u_i$ and download capacity $d_i$, for $i \in I$.
(Observe that the server has upload capacity $u_S$.) We assume
that all capacities are integral. All the hosts are assumed to
be identical with respect to processing speed, and to have
enough memory to sustain the distribution process. No host
can upload more than one block at any given time instant, but
a host can simultaneously upload to and download from other hosts.
Moreover, it can simultaneously download from multiple hosts
as long as the download capacity allows it. We also assume
that hosts always upload at their full capacity.

We assume that time in the file distribution process is
slotted. Each block transmission between hosts starts and
finishes within the same slot. We assume that no host uploads
to more than one host in one slot. In general, the slot duration
may vary from one slot to the next. However, unless otherwise
stated, we will assume during the rest of the paper that all
slots have the same duration $\gamma$. Then, if the process of file
distribution starts at time $t = 0$, the time interval $[0, \gamma]$
corresponds to slot $\tau = 1$ and, in general, slot $\tau$ spans the
time interval $[(\tau - 1)\gamma, \tau\gamma]$. In each slot of a scheme, a host
is assigned another host to serve (if any), and the set of blocks
it will serve during that slot. Note that hosts can only serve
blocks that have been received completely.

In this work we consider only the energy consumed by hosts 
during the file distribution process. We do not consider the 
energy consumed by other network devices. In our model, the 
energy consumption has the following three components: 

1) Each host $i \in I$, just for being on, consumes power
$P_i$ (when a host is off, we assume that it consumes no
power).

2) In addition, each host consumes $\delta_i \geq 0$, $i \in I$, for each
block served and/or received.

3) A host consumes energy while being switched on or off.
If host $i \in I$ takes time $\alpha_i$ to switch on or off, the energy
consumed by switching is given by $P_i \alpha_i$.
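
The three components above can be sketched in code. This is a minimal illustration rather than the authors' implementation: the class and function names are hypothetical, and it assumes that every block a host handles costs $\delta_i$ and that every on or off transition costs $P_i \alpha_i$.

```python
from dataclasses import dataclass


@dataclass
class Host:
    power_w: float               # P_i: power drawn while the host is on (W)
    block_energy_j: float        # delta_i: extra energy per block handled (J)
    switch_time_s: float = 0.0   # alpha_i: time to switch on or off (s)


def host_energy(host, on_time_s, blocks_handled, switches=0):
    """Energy in joules of one host under the three-component model.

    Assumption: `blocks_handled` counts the blocks charged delta_i and
    `switches` counts on/off transitions, each costing P_i * alpha_i.
    """
    baseline = host.power_w * on_time_s                      # component 1
    per_block = host.block_energy_j * blocks_handled         # component 2
    switching = host.power_w * host.switch_time_s * switches  # component 3
    return baseline + per_block + switching


if __name__ == "__main__":
    h = Host(power_w=80.0, block_energy_j=1.0, switch_time_s=2.0)
    # 10 s on, 5 blocks handled, switched on once and off once.
    print(host_energy(h, on_time_s=10.0, blocks_handled=5, switches=2))
```

Setting `switch_time_s` and `block_energy_j` to zero leaves only the time a host spends on, which is the special case for which the problem is stated to be NP-hard.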

B. Problem and its Complexity 

We define a file distribution scheme, or scheme for short, as 
a schedule of block transfers between hosts such that, after all 
the transfers, all the hosts have the whole file. Observe that a 
scheme must respect the model previously defined. Then, the 
problem we study in this paper is defined as follows. 

Definition 1: The file distribution energy minimization 
problem is the problem of finding or designing a file distribu- 
tion scheme that minimizes the total energy consumed. 
The bad news is that this problem is NP-hard even if switching
on and off is free and there is no additional energy
consumption per block (i.e., $\alpha_i = \delta_i = 0, \forall i \in I$). Please
refer to Appendix A for the NP-hardness proof. The good
news is that, as will be shown later, even though the general
problem is NP-hard, by making a few simplifying but still
realistic assumptions, we can solve the file distribution energy
minimization problem optimally.

C. Additional Assumptions 

Henceforth, we assume that all the hosts have the same
upload capacity $u$, and the same download capacity $d$. We
also assume that $d/u = k$ for some positive integer $k$. Unless
otherwise stated, we assume that hosts are switched on and
off instantaneously, i.e., $\alpha_i = 0, \forall i$, and hence switching
consumes no energy.

The uniformity of capacities results in a uniform slot
duration, equal to $\gamma = s/u$, for all the block transfers. A host
is said to be active in a time slot if it is receiving or serving
blocks in the slot. Otherwise, it is said to be idle. The energy
$\Delta_i$ consumed by an active host $i \in I$ in one slot can be
computed as follows.

$$\Delta_i = P_i \gamma + \delta_i = P_i \frac{s}{u} + \delta_i.$$

Without loss of generality, we assume that $\Delta_0 \leq \cdots \leq \Delta_{n-1}$.

In some cases below we will assume that the system is
energy-homogeneous. This means that all hosts have the same
energy consumption parameters, i.e., $P_i = P$ and $\delta_i = \delta$, for
all $i \in I$. In such a homogeneous system, all hosts also have
the same value of $\Delta_i = \Delta$. Note that, unless otherwise stated,
we assume a heterogeneous system.

Fig. 1. A slot as a directed transfer graph: (a) tree slot; (b) slot
with cycle. The number of blocks served in 1(b) is one more than
the number of blocks served in 1(a) with the same energy
consumption.

Let us consider parameters $n$, $k$, and $\beta$ of the file distribution
energy minimization problem. Let us define the set of all
possible schemes with these parameters by $Z^{k}_{n,\beta}$. Let $E(z)$
be the energy consumed by scheme $z \in Z^{k}_{n,\beta}$.

Definition 2: A scheme $z_0 \in Z^{k}_{n,\beta}$ is energy optimal (or
optimal for short) if $E(z_0) \leq E(z), \forall z \in Z^{k}_{n,\beta}$.
Hence, our objective in the rest of the paper is to find optimal
(or quasi-optimal) schemes.

D. Normal Schemes 

To rule out redundant and uninteresting schemes, we will
consider only what we call normal schemes. Observe that the
block transfers of a scheme $z$ in a slot $\tau$ can be modeled as
a directed transfer graph with the hosts as vertices and block
transfers as edges (see Fig. 1). Then, a normal scheme is a
distribution scheme in which there are no idle hosts, there are
no slots without active hosts, and each slot has a connected
transfer graph. We denote the set of normal schemes with
parameters $n$, $\beta$, and $k$ by $\mathcal{Z}^{k}_{n,\beta}$. From now onwards, we
will consider only normal schemes. It is easy to observe that
any optimal scheme can be transformed into a normal scheme
that is also optimal. Hence, we are not losing anything by
concentrating only on normal ones.

Observe that in a transfer graph the out-degree of each
vertex is at most 1 (by the upload constraint). Thus, the transfer
graph of a slot in a normal scheme can either be a tree (Fig.
1(a)) or a graph with exactly one cycle (Fig. 1(b)). Note also
that in a slot with cycle all hosts upload blocks, while in a
tree slot there are hosts that do not upload.
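
The tree/cycle dichotomy can be checked mechanically. The sketch below is illustrative only (the function name and the edge-list representation are assumptions, not part of the paper): it verifies the upload constraint and connectivity, then uses the fact that a connected graph with out-degree at most 1 has either $|V| - 1$ edges (a tree) or $|V|$ edges (exactly one cycle).

```python
def classify_slot(transfers):
    """Classify one slot's transfer graph as "tree" or "cycle".

    transfers: list of (sender, receiver) pairs, one per block transfer.
    Raises ValueError if the slot violates the upload constraint or is
    not connected, i.e. it could not belong to a normal scheme.
    """
    if not transfers:
        raise ValueError("a normal scheme has no slot without active hosts")
    senders = [s for s, _ in transfers]
    if len(senders) != len(set(senders)):
        raise ValueError("out-degree above 1: a host uploads to two hosts")

    hosts = {h for edge in transfers for h in edge}
    neighbours = {h: set() for h in hosts}
    for s, r in transfers:
        neighbours[s].add(r)
        neighbours[r].add(s)

    # Depth-first search on the undirected graph to test connectivity.
    seen, stack = set(), [transfers[0][0]]
    while stack:
        h = stack.pop()
        if h not in seen:
            seen.add(h)
            stack.extend(neighbours[h] - seen)
    if seen != hosts:
        raise ValueError("transfer graph is not connected")

    # Connected with out-degree <= 1: |E| is either |V| - 1 or |V|.
    return "tree" if len(transfers) == len(hosts) - 1 else "cycle"


if __name__ == "__main__":
    print(classify_slot([("S", "H0"), ("H0", "H1")]))
    print(classify_slot([("H0", "H2"), ("H1", "H0"), ("H2", "H1")]))
```

In the cycle case every active host appears as a sender, which matches the observation that all hosts upload in a slot with cycle.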

E. Costs 

Let us consider scheme $z \in Z^{k}_{n,\beta}$. Denote with $I^z_\tau \subseteq I$ the
indexes of the set of active hosts in time slot $\tau$ under scheme
$z$.

Definition 3: The cost of slot $\tau$ under scheme $z$, denoted
$c^z_\tau$, is the energy consumed by all active hosts in $\tau$, i.e.,

$$c^z_\tau = \sum_{i \in I^z_\tau} \Delta_i. \qquad (1)$$

Let $\tau^z_f$ be the makespan of scheme $z$, i.e., the time slot of
$z$ in which the distribution of the file is completed. Then, the
energy consumed by the scheme $z$ can be obtained as

$$E(z) = \sum_{\tau=1}^{\tau^z_f} c^z_\tau. \qquad (2)$$



The cost of a slot, as defined above, does not take into 
account which host is serving which block to which host. 
However, the total energy consumption of a scheme also 
depends on this. Thus, for a better insight on the schemes, 
we also associate a cost to a block transfer. 

We denote the set of blocks downloaded by host $i \in I$ in
slot $\tau$ under scheme $z$ by $\mathcal{B}^z_{i,\tau}$, and the index of the host
serving $b_j \in \mathcal{B}^z_{i,\tau}$ as $serv(j,i)$.

Definition 4: We define the cost $c^z_{j,i}$ of a block $b_j$ received
by $H_i$ under scheme $z$ as

$$c^z_{j,i} = \Delta_i V_{j,i} + \Delta_{serv(j,i)} U_{j,i}, \qquad (3)$$

where, if $b_j$ is received by $H_i$ in slot $\tau$,

$$V_{j,i} = \begin{cases} 1 & \text{if } j = \min\{l \mid b_l \in \mathcal{B}^z_{i,\tau}\} \\ 0 & \text{otherwise,} \end{cases}
\qquad
U_{j,i} = \begin{cases} 1 & \text{if } \mathcal{B}^z_{serv(j,i),\tau} = \emptyset \\ 0 & \text{otherwise.} \end{cases}$$

$V_{j,i}$ accounts for the energy consumption of host $H_i$ (in
units of $\Delta_i$) that is receiving the block. A block contributes
to the energy consumed by $H_i$ if it is downloading. If a
host is downloading more than one block in parallel, then we
assume that only one block adds to the cost, as the rest of the
blocks can be received without incurring any further cost. $U_{j,i}$
accounts for the energy consumption of the host that is serving
the block when $\mathcal{B}^z_{serv(j,i),\tau} = \emptyset$ (the host that is serving $b_j$ to
$H_i$ is not downloading any block).

With the above definition, the sum of the costs of all blocks
transferred in slot $\tau$ should be equal to the cost of the slot,
$c^z_\tau$. The next result establishes that this is indeed true for all
the schemes. The proof can be found in Appendix B.

Theorem 1: The sum of the costs of all the blocks
transferred during slot $\tau$ is equal to the cost of that slot, i.e.,

$$\sum_{i \in I} \sum_{b_j \in \mathcal{B}^z_{i,\tau}} c^z_{j,i} = c^z_\tau. \qquad (4)$$

Thus, we can express the energy of a scheme $z$ in terms of
the cost of blocks, as

$$E(z) = \sum_{\tau=1}^{\tau^z_f} \sum_{i \in I} \sum_{b_j \in \mathcal{B}^z_{i,\tau}} c^z_{j,i}. \qquad (5)$$

III. Theoretical Analysis

In this section we provide analytical results for the file
distribution energy minimization problem, under the additional
assumptions described previously. The results in this section
are classified depending on the ratio $k$ between the download
and upload capacities. First, we derive lower bounds on the
energy consumption, and provide optimal schemes for the case
$k = 1$. For $k > 1$, we provide optimal and near-optimal bounds
and algorithms.

A. Download Capacity = Upload Capacity 

In this setting, a host can download at most one block during
a slot. We first provide lower bounds on the energy consumed
by any scheme. Then, we present several optimal schemes,
and we derive the value of $\beta$ that minimizes the energy of
optimal schemes in energy-homogeneous systems.

1) Lower Bound: The following theorem provides a lower
bound on the energy consumed by any distribution scheme
when $k = 1$.

Theorem 2: The energy required by any scheme $z$ to
distribute a file divided into $\beta$ blocks among $n$ clients when
$k = d/u = 1$ satisfies

$$E(z) \geq \beta \left( \Delta_S + \sum_{i=0}^{n-1} \Delta_i \right) + \max\{0, n - \beta\} \min\{\Delta_S, \Delta_0\}.$$

The key observation behind this result is that each client has to
be active for at least $\beta$ slots to receive the file, whereas the
server has to be active for at least $\beta$ slots to upload one copy
of each block among the clients. The proof of the theorem can
be found in Appendix C.
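
As a quick numerical companion to Theorem 2, the sketch below evaluates the bound $\beta(\Delta_S + \sum_i \Delta_i) + \max\{0, n - \beta\}\min\{\Delta_S, \Delta_0\}$ for given per-slot energies. The function name and argument layout are illustrative assumptions; the client list does not need to be sorted because the smallest client value is taken explicitly.

```python
def lower_bound_k1(delta_server, delta_clients, beta):
    """Lower bound on the energy of any scheme when d = u (k = 1).

    delta_server:  per-slot energy of the server (Delta_S)
    delta_clients: per-slot energies of the n clients (Delta_0..Delta_{n-1})
    beta:          number of blocks the file is divided into
    """
    n = len(delta_clients)
    # Server and every client must each be active for at least beta slots.
    base = beta * (delta_server + sum(delta_clients))
    # With more clients than blocks, the cheapest of S and H_0 serves extra slots.
    extra = max(0, n - beta) * min(delta_server, min(delta_clients))
    return base + extra


if __name__ == "__main__":
    # Energy-homogeneous example: 4 clients, 2 blocks, Delta = 2 J per slot.
    print(lower_bound_k1(2.0, [2.0] * 4, beta=2))
```

In the energy-homogeneous case with $\beta \leq n$ the expression collapses to $n(\beta + 1)\Delta$, which is 24 J for the values in the example.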

2) Optimal Distribution Schemes: We now present optimal
schemes achieving the lower bound of Theorem 2. We
distinguish among three cases, depending on the relation between
$n$ and $\beta$, and we indicate the resulting schemes as Algorithms
1, 2, and 3. Note that in pseudocode, the transfer of block $b_j$
from host $H$ to host $H'$ is expressed as $H \xrightarrow{b_j} H'$. Also, all
the transfers that occur in the same slot are enclosed by the
lines begin slot and end slot. While the three algorithms could
be merged into a single one, we have chosen to present them
separately for clarity.

We now provide some intuition on the algorithms. We start
from Algorithm 1, which assumes that the number of clients
is equal to the number of blocks. As each host has to be active
for at least $\beta$ slots to receive the complete file, Algorithm 1 makes
sure that the hosts are active for exactly $\beta$ slots. In the first
$n$ slots of the algorithm, the server uploads a different block
of the file to each of the $n$ clients. Since $n = \beta$, the server
can upload the whole file to the clients in $n$ slots. Then the
server goes off. At this point, all the clients have one block and
they all need to get the remaining $n - 1$ blocks. Each client
chooses a client to serve, in a way that the resulting transfer
graph is a cycle of $n$ nodes. All the clients start uploading the
latest block they have received, and this process continues for
$\beta - 1$ slots, until all the hosts have all the blocks.

Algorithm 2, which assumes $n < \beta$, is more involved, but
uses similar ideas to Algorithm 1. In Fig. 2 we present a
toy example of a scheme obtained from Algorithm 2. In
Algorithm 3 the number of clients is larger than the number of
blocks. Thus some hosts will have to upload the same block
more than once. In this algorithm, after the server has
served the first $\beta$ blocks, the host with the smallest energy
consumption per slot uploads block $b_0$ to those hosts without
any block.

Fig. 2. Example of Algorithm 2 for $n = 3$ and $\beta = 4$. The label on each
arrow is the index of the block being served.

Theorem 3: When $d = u$, Algorithms 1, 2, and 3 describe
optimal distribution schemes, with energy

$$E(z) = \beta \left( \Delta_S + \sum_{i=0}^{n-1} \Delta_i \right) + \max\{0, n - \beta\} \min\{\Delta_S, \Delta_0\}.$$

For the proof, please refer to Appendix D. In what follows,
with $Opt(n, \beta)$ we indicate the algorithm corresponding to the
values of $n$ and $\beta$.



Algorithm 1 Optimal scheme for $\beta = n$

1: for $j = 0 : n - 1$ do
2:   begin slot
3:     $S \xrightarrow{b_j} H_j$
4:   end slot
5: end for
6: for $j = n : 2n - 2$ do
7:   begin slot
8:     for $i = 0 : n - 1$ do
9:       $H_i \xrightarrow{b_{(i+j) \bmod n}} H_{(i-1) \bmod n}$
10:    end for
11:  end slot
12: end for
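
The schedule of Algorithm 1 can be replayed in a few lines to check the two properties the text relies on: every client ends with all blocks, and every host is active for exactly $\beta = n$ slots. This is a sketch under stated assumptions, not the authors' code: the block index $(i + j) \bmod n$ in the ring phase is chosen so that each client forwards the block it received most recently, as described in the text, and all function names are hypothetical.

```python
def algorithm1(n):
    """Slots of the beta = n scheme; each slot lists (sender, receiver, block)."""
    # Phase 1: the server hands a different block to each client, one per slot.
    slots = [[("S", j, j)] for j in range(n)]
    # Phase 2: clients form a ring and forward the latest block they hold.
    for j in range(n, 2 * n - 1):
        slots.append([(i, (i - 1) % n, (i + j) % n) for i in range(n)])
    return slots


def simulate(n, slots):
    """Replay a schedule; return blocks held per client and active slots per host."""
    have = {i: set() for i in range(n)}
    active = {i: 0 for i in range(n)}
    active["S"] = 0
    for slot in slots:
        # A block can only be served if it was fully received in an earlier slot.
        snapshot = {i: set(blocks) for i, blocks in have.items()}
        busy = set()
        for sender, receiver, block in slot:
            if sender != "S" and block not in snapshot[sender]:
                raise ValueError(f"host {sender} does not hold block {block}")
            have[receiver].add(block)
            busy.update((sender, receiver))
        for host in busy:
            active[host] += 1
    return have, active


if __name__ == "__main__":
    n = 4
    have, active = simulate(n, algorithm1(n))
    print(all(have[i] == set(range(n)) for i in range(n)))
    print(active)
```

With per-slot energy $\Delta_i$ for host $i$, the active-slot counts give a total energy of $\beta(\Delta_S + \sum_i \Delta_i)$, matching the lower bound for $\beta = n$.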



3) Optimal Number of Blocks in Energy-Homogeneous
Systems: In this section we consider an energy-homogeneous
system, in which all hosts have the same energy consumption
parameters, i.e., $P_i = P$ and $\delta_i = \delta$, for all $i \in I$. In this
system we want to find the optimal number of blocks $\beta$ into which
the file should be divided for minimum energy consumption.
Intuitively, the number of blocks into which the file must be
divided depends on the value of $\delta$. If $\delta$ is very large, then it
is better to divide the file into a small number of blocks, since
each block transmission consumes additional energy $\delta$. On the
other hand, if $\delta$ is small, we can divide the file into a number
of blocks such that the energy consumed is reduced due to
concurrent transfers.
Algorithm 2 Optimal scheme for $\beta > n$

1: for $j = 0 : n - 1$ do
2:   begin slot
3:     $S \xrightarrow{b_j} H_j$
4:   end slot
5: end for
6: for $j = n : \beta - 1$ do
7:   begin slot
8:     $S \xrightarrow{b_j} H_{n-1}$
9:     for $i = 1 : n - 1$ do
10:      $H_i \xrightarrow{b_{i+j-n}} H_{i-1}$
11:    end for
12:  end slot
13: end for
14: for $j = \beta : \beta + n - 2$ do
15:   begin slot
16:     for $i = 1 : n$ do
17:       $H_{i \bmod n} \xrightarrow{b_{(i+j-n) \bmod \beta}} H_{(i-1) \bmod n}$
18:     end for
19:   end slot
20: end for

Algorithm 3 Optimal scheme for $\beta < n$.
$H_{\min}$ is the host with smallest $\Delta_i$. ($H_{\min} \in \{S, H_0\}$.)

for $j = 0 : \beta - 1$ do
  begin slot
    $S \xrightarrow{b_j} H_j$
  end slot
end for
for $j = \beta : n - 1$ do
  begin slot
    $H_{\min} \xrightarrow{b_0} \ldots$
    for $i = 1 : \beta - 1$ do
      $\ldots$
    end for
  end slot
end for
for $j = n : \ldots$ do
  begin slot
    $\ldots$
    for $i = 0 : \beta - 2$ do
      $\ldots$
    end for
  end slot
end for

The following theorem presents the optimal value of $\beta$.

Theorem 4: In an energy-homogeneous system with $k =
d/u = 1$, the value of $\beta$ that minimizes the energy consumption
of an optimal scheme is

$$\beta = \min\left\{ \sqrt{\frac{P B}{\delta u}},\; n \right\}. \qquad (6)$$

Note that if the value of $\sqrt{PB/(\delta u)}$ is not an integer, it has to
be rounded to one of the two closest integer values, such that
$E(\beta)$ is minimum.
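
The rounding step can be made concrete by evaluating the energy of the optimal scheme at the two integers around the unconstrained minimizer. The sketch below rests on assumptions that are not spelled out in this section: the per-slot energy is taken as $\Delta = P s/u + \delta$ with $s = B/\beta$, the energy of the optimal scheme is taken from Theorem 3 in its homogeneous form, and $\delta > 0$. Under these assumptions the unconstrained minimizer is $\sqrt{PB/(\delta u)}$, capped at $n$. Function names are hypothetical.

```python
import math


def energy_opt(beta, n, P, delta, B, u):
    """Energy of the optimal k = 1 scheme in an energy-homogeneous system.

    Assumes Delta = P * s / u + delta with block size s = B / beta.
    """
    slot_energy = P * B / (beta * u) + delta
    # Theorem 3 with all Delta_i equal: (beta * (n + 1) + max(0, n - beta)) * Delta.
    return (beta * (n + 1) + max(0, n - beta)) * slot_energy


def best_beta(n, P, delta, B, u):
    """Integer number of blocks minimizing energy_opt (requires delta > 0)."""
    x = math.sqrt(P * B / (u * delta))
    # Round both ways, keep candidates inside [1, n], and compare energies.
    candidates = {max(1, min(n, math.floor(x))), max(1, min(n, math.ceil(x)))}
    return min(candidates, key=lambda b: energy_opt(b, n, P, delta, B, u))


if __name__ == "__main__":
    # 100 MB file (8e8 bits), 10 Mbps links, P = 80 W, delta = 1 J, 200 clients.
    print(best_beta(n=200, P=80.0, delta=1.0, B=8e8, u=1e7))
```

Comparing the two neighbouring integers directly avoids relying on the closed form being exactly symmetric around its minimum.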

B. Download Capacity > Upload Capacity 

In this subsection, we consider an energy-homogeneous
system in which $k > 1$.

1) Lower Bound: In this section, we present a lower bound
on the energy of a schedule in an energy-homogeneous system
with $k > 1$. In this setting, the possibility to download more
than one block in a slot implies that the minimum number of
slots in which a host has to be on can be less than $\beta$.
Fig. 3. A representation of Algorithm 4 to visualize the distribution of blocks
using the ideas of Algorithms 1 and 2.



Theorem 5: Let $z$ be an optimal schedule in an
energy-homogeneous system. Then the energy consumed by $z$ satisfies

$$E(z) \geq n(\beta + 1) \cdot \Delta. \qquad (7)$$

The derivation of this bound is based on proving that the
required number of tree slots is at least $n$, because there are
$n$ clients. For the complete proof, please refer to Appendix F.

2) (Quasi-)Optimal Distribution Schemes: Observe that the
energy consumption of Algorithms 1 and 3 in an
energy-homogeneous system with $\beta \leq n$ is exactly $n(\beta + 1)\Delta$
(Theorem 3). Hence, these algorithms describe optimal schemes
for this system. However, if $\beta > n$, the algorithm for $k = 1$
(Algorithm 2) is not optimal anymore if $k > 1$. In this section
we present an algorithm, namely Algorithm 4, that describes
a distribution scheme for this case. In fact, the scheme works
with $k = 2$, as no host has more than two downloads in
parallel.

Algorithm 4 distributes the file among the clients using ideas
from Algorithms 1 and 2. We represent the state of the process
with a two-dimensional array $A$ of size $n \times \beta$ (Fig. 3), with the
rows and the columns representing the clients and the blocks,
respectively. We set an entry $A_{ij} = 1$, $i \in \{0, 1, \ldots, n-1\}$,
$j \in \{0, 1, \ldots, \beta-1\}$, if and only if $H_i$ has received $b_j$, and
$A_{ij} = 0$ otherwise. At the beginning, all the entries are 0, and
after the completion of the algorithm they all should be 1.
Furthermore, imagine the array $A$ divided into $\lfloor \beta/n \rfloor - 1$ square
subarrays of size $n \times n$ and one rectangular subarray of size
$n \times (n + b)$, where $b = \beta \bmod n$. (Note that this is just a
conceptual division to understand Algorithm 4 in terms of
Algorithms 1 and 2.)

After the first loop, the diagonal of the first square subarray
is set to 1, i.e., $A_{ii} = 1, \forall i \in \{0, \ldots, n-1\}$. Additionally,
after the second loop, the top left corner position (see Fig. 3)
of each subarray has also been set to 1, i.e., $A_{0j} = 1, \forall j \in
\{0, n, 2n, \ldots, (\lfloor \beta/n \rfloor - 1)n\}$. In each iteration of the for loop at
Line 12, the elements of one of the $n \times n$ subarrays are
set to 1 by serving in the same fashion as in Algorithm 1,
while the server completes serving the diagonal of the next
square/rectangular subarray. When Line 22 is reached, all the
elements of all the square subarrays are marked as 1. The
remaining blocks are served using Lines 6-20 of Algorithm 2
with an appropriate relabeling of the blocks.

We present the bounds achieved in this section in the
following theorem. The proof of the second claim can be found
in Appendix G.

Theorem 6: In a homogeneous system with $k > 1$,
• If $\beta \leq n$, then Algorithms 1 and 3 describe optimal
distribution schemes with energy $E(z) = n(\beta + 1) \cdot \Delta$.



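Algorithm 4 Energy saving scheme for case $k = 2$ and $\beta > n$

1: $b = \beta \bmod n$
2: for $j = 0 : n - 1$ do
3:   begin slot
4:     $S \xrightarrow{b_j} H_j$
5:   end slot
6: end for
7: for $j = 1 : \lfloor \beta/n \rfloor - 1$ do
8:   begin slot
9:     $S \xrightarrow{b_{jn}} H_0$
10:  end slot
11: end for
12: for $i = 0 : \lfloor \beta/n \rfloor - 2$ do
13:   for $j = 0 : n - 2$ do
14:     begin slot
15:       $S \xrightarrow{b_{(i+1)n+j+1}} H_{j+1}$
16:       for $l = 0 : n - 1$ do
17:         $H_l \xrightarrow{b_{in+((l+j) \bmod n)}} H_{(l-1) \bmod n}$
18:       end for
19:     end slot
20:   end for
21: end for
22: Run Lines 6-20 of $Opt(n, n + b)$ after renaming the block $b_{\beta-(n+b)+j}$
    to $b_j$, $\forall j \in \{0, 1, \ldots, n + b - 1\}$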



• If $\beta > n$, then Algorithm 4 describes a distribution
scheme with energy

$$E(z) = \left( n(\beta + 1) + \left\lfloor \frac{\beta}{n} \right\rfloor + b - 1 \right) \cdot \Delta, \qquad (8)$$

where $b = \beta \bmod n$.

While Algorithm 4 does not achieve optimal energy when
$\beta > n$, it is quasi-optimal, since it is off from the lower bound
by an additive term of $(\lfloor \beta/n \rfloor + b - 1)\Delta$, which is smaller than
the term $n(\beta + 1)\Delta$. It is important to note that Algorithm 4
uses $k = 2$. Hence, the upper bounds on the minimum energy
presented here hold for all values of $k > 1$.
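
To get a feel for how small the additive gap is, the sketch below compares the energy of Algorithm 4, taken as the lower bound $n(\beta + 1)\Delta$ plus the additive term $(\lfloor \beta/n \rfloor + b - 1)\Delta$ with $b = \beta \bmod n$, against the lower bound itself. The function names and the sample values are illustrative assumptions.

```python
def lower_bound_k_gt_1(n, beta, slot_energy):
    """Lower bound n * (beta + 1) * Delta for an energy-homogeneous system, k > 1."""
    return n * (beta + 1) * slot_energy


def algorithm4_energy(n, beta, slot_energy):
    """Energy of the quasi-optimal scheme for beta > n (it only needs k = 2)."""
    if beta <= n:
        raise ValueError("this expression applies only when beta > n")
    b = beta % n
    # Lower bound plus the additive term (floor(beta / n) + b - 1) * Delta.
    return (n * (beta + 1) + beta // n + b - 1) * slot_energy


if __name__ == "__main__":
    n, beta, delta_slot = 200, 450, 1.0
    bound = lower_bound_k_gt_1(n, beta, delta_slot)
    energy = algorithm4_energy(n, beta, delta_slot)
    print(energy - bound, (energy - bound) / bound)
```

For 200 clients and 450 blocks the gap is 51 slot-energies against a bound of 90200, i.e. well below 0.1%.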

IV. Performance Evaluation 

In order to assess the performance of our scheme, we have 
run an extensive simulation study with two objectives. First, to
evaluate quantitatively the results of our analysis in Section III.
Second, to understand the impact on the performance of our
schemes of some effects (like energy cost associated to on/off 
transitions, network congestion, or the variable power con- 
sumption among the devices involved in the file distribution 
process) not considered in our analysis, but typical of real 
scenarios. 

A. Experimental Setup 

In this section we briefly present a description of the 
experimental setup. 

1) Scenarios: In our experiments we have considered two 
different scenarios, corresponding to two different application 
contexts for the file distribution problem. 
- Homogeneous scenario: In this case, all the hosts participating
in the file distribution process have the same configuration.
Specifically, we have considered the following values for the
relevant input parameters in our experiments: nominal power
$P = 80$ W, $\delta = 1$ Joule, and upload and download capacity
$u = d = 10$ Mbps. Finally, unless otherwise stated, we
consider a scenario with one server and 200 hosts.

This homogeneous scenario models a corporate network 
in which both the network infrastructure and the whole set 
of devices belong to the same company/organization, and 
are centrally managed. Typical file distribution processes in 
this context are software updates (e.g. OS, antivirus), which 
are usually centrally coordinated by system administrators. 
These environments are typically characterized by a relatively 
high uniformity in the network infrastructure and in the 
user terminals, especially if compared with the Internet. It 
is expected that communications among hosts in this type 
of intranet scenario happen at high bit rates, and that the 
bottleneck for file transfers happens at the terminals rather than 
in the network. Finally, it is worth mentioning that, in these
settings, energy expenditure is a concern for the organization,
as it directly impacts the OPEX of the IT infrastructure.

- Heterogeneous scenario: In this setting, we analyze the 
impact of heterogeneity in host configurations on the per- 
formance of our schemes. This scenario captures the case in 
which hosts are typical Internet nodes (including home users), 
and it is therefore characterized by a significant variability 
across hosts in both the energy consumption profile and the 
observed network performance (i.e. different access speed 
and congestion conditions). In this case, the file distribution
process is represented by, for instance, a software package being
released^1 (e.g., a new Linux distribution). In this scenario,
the incentive for saving energy comes from corporate and
individual sensitivity towards reducing the carbon footprint,
since the potential economic benefits for a single host are
usually negligible.

In this setting we assume $u_i = d_i, \forall i \in I$. In order to
simplify our study, in our experiments we consider separately
the effect of heterogeneity in power consumption and the effect
of varying network conditions.

2) File Distribution Schemes: The file distribution schemes 
that we have considered in the performance evaluation are: 

- Opt: This is the file distribution scheme detailed in
Section III. It is a distributed scheme, since the upload capacity
for distributing the file is made available by the same hosts
that are downloading the file.

- Parallel: This is a centralized scheme, in which all users
download the same file at the same time from the same server
in parallel. This is one of the most common architectures
for file distribution, and it models a large number of file
distribution services present in the current Internet (e.g.,
One-Click Hosting systems such as Megaupload or RapidShare).

- Serial: In this centralized scheme, the server uploads in
sequence the complete file to the hosts involved in the file
distribution process. That is, the server uploads the complete
file to the first host. Once it finishes, it uploads the file to
the second host, and so on. We consider this scheme because
when $u_i = d_i$ it minimizes the amount of time each host is
active in order to receive a file, and therefore the amount of
energy spent by each host in the distribution process. This is
realized at the expense of the server, which has to remain on
for the whole duration of the scheme.

^1 Other applications such as entertainment content (video, music) file
distribution also fit into this scenario.

3) Energy Model: For our experiments we considered two
different energy models. In the first one, the hosts only have two
power states: an OFF state, in which they do not consume
anything, and an ON state, in which they consume the full nominal
power, equal to 80 W (typical nominal power consumption for
notebooks and desktop PCs lies in the range 60 W-80 W [15]).
Unless otherwise stated, this is the default energy model for
our experiments.

In order to understand the impact of load-proportional energy
consumption in our schemes, we consider a model that fits
most of the current network devices [15], in which the energy
consumed has some dependency on the CPU utilization and
network activity. This energy model is characterized by four 
states. Besides the OFF state, the other states are: the IDLE 
state, in which the device is active but not performing any 
task, and consuming 80% of the nominal power; the TX-or- 
RX state, in which the device is active and either transmitting 
or receiving, and consuming 90% of the nominal power; the 
TX-and-RX state, in which the device is active and both 
transmitting and receiving, and consuming its full nominal 
power. We considered this model to analyze the impact of 
load proportionality on the overall energy consumption of the 
schemes considered in our experiments. 
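
The four-state model lends itself to a small lookup table. The sketch below is only an illustration of the percentages quoted above (the state labels, the 80 W default and the timeline format are naming assumptions); it sums the energy of a host over a list of (state, duration) intervals.

```python
NOMINAL_W = 80.0  # nominal power used in the experiments described above

# Fraction of nominal power drawn in each state of the four-state model.
STATE_FRACTION = {
    "OFF": 0.0,
    "IDLE": 0.8,        # on but performing no task
    "TX_OR_RX": 0.9,    # either transmitting or receiving
    "TX_AND_RX": 1.0,   # transmitting and receiving at the same time
}


def energy_joules(timeline, nominal_w=NOMINAL_W):
    """Energy of one host given a timeline of (state, seconds) intervals."""
    return sum(STATE_FRACTION[state] * nominal_w * seconds
               for state, seconds in timeline)


if __name__ == "__main__":
    timeline = [("IDLE", 10), ("TX_OR_RX", 10), ("TX_AND_RX", 10), ("OFF", 100)]
    print(energy_joules(timeline))
```

The two-state model is the special case in which every non-OFF state maps to a fraction of 1.0.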



In Section IV-C1 we analyze the effect of having devices with
heterogeneous power consumption profiles. For this purpose
we use the previously described two-state model, but we
assume that for each host its nominal power consumption
is drawn from one of two different distributions: (i) a Gaussian
distribution with an average of 80 W and a standard deviation
of 20 W, and (ii) an exponential distribution, with an average
of 80 W.

Note that, although large servers typically have a larger
nominal power, in our experiments we assign to the server
the same nominal power as a regular host. This assumption is
consistent with our intention to be conservative in our study,
since our schemes require the server to be active for far less time
than the serial and parallel schemes.

4) Goodness Metric: The goodness metric we have used 
in order to compare the energy consumption of different file 
distribution schemes is energy per bit, computed as the ratio 
of the total amount of energy consumed by the distribution 
process, divided by the sum of the sizes of all the files 
delivered in the scheme. 
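The metric can be written down directly. The sketch below is only an illustration: the function name, the use of joules for energy, and the use of bytes for file sizes are assumptions of this example.

```python
def energy_per_bit(total_energy_joules, delivered_file_sizes_bytes):
    """Energy per bit: total energy divided by the total number of bits delivered.

    delivered_file_sizes_bytes holds one entry per file copy delivered to a host.
    """
    total_bits = 8 * sum(delivered_file_sizes_bytes)
    if total_bits <= 0:
        raise ValueError("at least one non-empty file must be delivered")
    return total_energy_joules / total_bits
```

When one file of size B is delivered to n hosts, the denominator is simply 8 * n * B bits.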

B. Homogeneous Scenario 

1) Validation of the Analysis: In Fig. 4 we have plotted
the energy per bit consumed by the file distribution process
as a function of the size of the file, for the three
file distribution schemes considered. As we can see, our
schemes perform consistently better than both the serial and
parallel schemes. In particular, by maximizing the amount of
time in which hosts serve while being served, our schemes
tend towards halving the total energy cost of serving
a block with respect to the serial scheme. This performance
improvement with respect to the serial scheme is due to the use
of (p2p-like) distribution, and indeed it decreases as the file
size (and the number of blocks into which it is split) decreases.
With respect to the serial scheme, our optimal schemes make
the most out of the energy consumed by all hosts which
are active and being served at a given time, by having them
contribute as much as possible to the file distribution. As a
consequence, although each host spends more time in an active
state than in the serial scheme, the net effect is a decrease of
the total energy.

Moreover, we can also observe how the parallel scheme per- 
forms consistently worse than any other scheme, consuming up 
to two orders of magnitude more than the serial scheme. Since 
the utilization of this parallel scheme is widespread in the 
current Internet, our observations confirm the great potential 
of distributed schemes for saving energy. 

Fig. 4 also depicts the performance of our Opt algorithm for
different numbers of hosts (50, 200, and 400). We observe that
the energy per bit consumed by our algorithm, as well as by
the serial scheme, is not affected by the number of hosts in
the scheme. Hence, for the rest of the section we present
results exclusively for a setting with 200 hosts.

Finally, it is worth noting that, for the optimal scheme, 
the nonsmooth variation of the energy per bit with file size, 
observable at low values of file size, is due to quantization in 
the number of blocks. The serial and parallel schemes (for 
which there is no partition of the file into blocks) have a 
smoother behavior with respect to file size. 

2) Block Size: The impact of the total number of blocks
on the energy consumed by our Opt scheme can be seen in
Fig. 5, where we plot the energy per bit consumed by Opt
for variable file sizes and a total of 200 hosts. The green
curve corresponds to the case in which a fixed block size,
equal to 256 kB, is used, while the lower red one is obtained
by using an optimal block size, according to the formula in
Section III-A3. We see that the use of an optimal block size
increases the energy savings mainly for small file
sizes. The reason is that for small file sizes a fixed block
size leads to a small number of blocks, and consequently to
less use of the distributed (p2p-like) mechanisms which, in
our scheme, improve the efficiency of the distribution process.

3) ON/OFF Energy Costs: As seen in previous sections,
our optimal algorithms proceed in rounds. Typically, not every
host is on in every round (i.e., some go on and off more
than once during the file distribution process). In a realistic
scenario, a host takes some time both to go off (or into a very
low power mode) and to get back to active mode. Usually, this
on/off time is in the order of a few seconds [16]. The additional
amount of energy consumed while switching between these
power states (which we call here "on/off costs") potentially has
an important impact on the energy performance of a scheme,
penalizing specifically those schemes in which host activity is
more "discontinuous" over time.

Fig. 4. Energy per bit consumed by Opt as a function of file size, compared
with the serial and the parallel scheme. Block size: 256 kB.

Fig. 5. Impact of the choice of the number of blocks on the energy per bit
consumed by our algorithm, as a function of file size, with 200 hosts.

Fig. 6. Impact of on/off energy cost on the energy per bit consumed by our
algorithm, as a function of file size, with 200 hosts.

Fig. 7. Impact of the energy model on the energy per bit consumed by our
algorithm, as a function of file size, with 200 hosts.

Fig. 8. Impact of heterogeneity in nominal power on the energy per bit
consumed by our algorithm, as a function of file size, with 200 hosts.
Curves are plotted with 95% confidence interval.

Fig. 9. Impact of variable network conditions on the energy per bit consumed
by our algorithm, as a function of file size, with 200 hosts. Curves are
plotted with 95% confidence interval.

In order to mitigate the negative impact of on/off costs,
in our simulations we implement the following mechanism.
When a host A has finished its activity (i.e., uploading and/or
downloading a block) in a slot $t_1$, and has no activity until
slot $t_2$, it computes the energy cost of staying on ($cost_{on}$)
until slot $t_2$ and the cost of going off during the rest
of slot $t_1$ and switching on at the beginning of slot $t_2$
($cost_{off/on}$). If $cost_{on} < cost_{off/on}$, A decides to stay
on. Otherwise, it goes off for its non-active period between
slots $t_1$ and $t_2$.
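The stay-on versus switch-off rule described above can be sketched as follows. This is a hedged illustration, not the simulator's code: the function name and parameters are hypothetical, and the assumption that a host draws a fixed power `transition_w` (by default its nominal power) for the whole duration of the off and on transitions is ours, since the text does not specify how the switching energy is computed.

```python
def should_stay_on(idle_seconds, nominal_w, switch_off_s, switch_on_s, transition_w=None):
    """Return True if staying on during an idle gap costs less than an off/on cycle.

    idle_seconds: length of the gap between the end of activity in slot t1
                  and the start of activity in slot t2.
    transition_w: power drawn while switching (assumption: nominal power).
    """
    if transition_w is None:
        transition_w = nominal_w
    cost_on = nominal_w * idle_seconds                        # energy of idling through the gap
    cost_off_on = transition_w * (switch_off_s + switch_on_s)  # energy of the off/on cycle
    return cost_on < cost_off_on
```

Under these assumptions, a host with a 1 s gap and 2 s transitions stays on, while a host with a 10 s gap switches off.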

Fig. 6 presents the energy consumed by our scheme in
comparison to the serial scheme, considering a switch on/off
time equal to 2 s and 4 s. As expected, the on/off costs increase
the energy per bit consumed by all schemes. This increment
is more pronounced for small file sizes, where
on/off costs bring the performance of our scheme closer to (but
still better than) that of the serial scheme. Conversely, for medium/large
file sizes, the contribution of on/off costs to the total energy
consumed by a scheme becomes marginal, and the performance
of both the optimal scheme and the serial scheme approaches
that of the case without on/off costs. Note that the widening
of the gap between the serial scheme and our scheme for file
sizes around 50 MB is due to the different behavior of our
scheme in the case $n < \beta$ and in the other case.

4) Load Dependency: In this set of experiments, we have
analyzed the impact of the four-state energy model described
in Section IV-A3, which implies some degree of energy
proportionality of the host devices. The research community is
putting a lot of effort into energy proportionality. Hence, in the
future it is expected that network devices will consume energy
proportionally to the supported load. Fig. 7 shows that with
the four-state energy model the percentage decrease in the
energy per bit consumed by our Opt scheme and by the serial
one is the same. This suggests that even with load-proportional
hardware our scheme enables significant energy savings with
respect to the serial one.

C. Heterogeneous Scenario 

In this subsection we consider two separate heterogeneous
scenarios. On the one hand, we study the case in which 
different hosts present different power consumption profiles. 
On the other hand, we address the scenario in which each host 
observes different network conditions (i.e., different access 
speed and congestion level). 

1) Heterogeneous Power Consumption: In Section III-A we
have proved analytically that our Opt algorithm minimizes
the overall power consumption of the file distribution process,
even in a heterogeneous scenario in which each host
presents a different energy consumption (as long as all the
nodes have the same upload and download rate). To validate
this statement, in this subsection we have run experiments
in which the nominal power consumed by the hosts varies
according to either a Gaussian or an exponential distribution,
as defined in Section IV-A3. Then, the energy consumption
has been compared with a homogeneous scenario. The results,
presented in Fig. 8, validate our analysis, since the three curves
for the Opt scheme overlap perfectly. We also observe that
heterogeneous power consumption has some minor impact in
the case of the serial scheme. Finally, it is worth noting that
confidence intervals have been calculated for each curve (but
not shown for clarity), being in any case lower than 5%.



2) Heterogeneous Network Conditions: In the results presented
so far we have considered (i) similar upload/download access
speed for all hosts and (ii) no network congestion. In this
subsection we relax these assumptions, and consider a
heterogeneous scenario where hosts have different access speeds
and observe different network states (e.g., congestion). This
scenario models more accurately a content distribution process in
the Internet.

In particular, in the simulations we model the different
nominal access speeds of hosts using an exponential distribution,
based on realistic speed values provided in [17].
Additionally, in order to model the variation in link speed over
time due to network conditions (i.e., congestion) we multiply
the nominal access speed by a positive factor taken from a
Gaussian distribution with average 1 and standard deviation
0.07. Fig. 9 presents the results for these heterogeneous
network conditions, for both our Opt scheme and the serial
scheme, and compares them with the homogeneous case. The
results show that both schemes suffer from an increment in the
power consumption with respect to the homogeneous case.
However, the relative difference between the Opt and serial
schemes increases. This suggests that even in heterogeneous
network conditions the proposed algorithm outperforms any
centralized scheme.
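The way link speeds are generated in this experiment can be sketched as below. This is a minimal illustration under stated assumptions: the function name, the mean speed parameter, the per-slot resampling of the congestion factor, and the rejection of non-positive factors are choices of this example, since the text only states that the factor is positive and Gaussian with average 1 and standard deviation 0.07.

```python
import random


def sample_link_speeds(num_hosts, mean_speed_bps, num_slots, sigma=0.07, seed=0):
    """Sample per-host nominal speeds and per-slot effective speeds.

    Nominal speeds are exponential with the given mean (assumed parameter).
    Each slot multiplies the nominal speed by a positive Gaussian factor
    with mean 1 and standard deviation sigma.
    """
    rng = random.Random(seed)
    nominal = [rng.expovariate(1.0 / mean_speed_bps) for _ in range(num_hosts)]
    speeds = []
    for base in nominal:
        row = []
        for _ in range(num_slots):
            factor = rng.gauss(1.0, sigma)
            while factor <= 0.0:          # keep the factor positive, as the text requires
                factor = rng.gauss(1.0, sigma)
            row.append(base * factor)
        speeds.append(row)
    return nominal, speeds
```

With a standard deviation of 0.07 the rejection loop almost never triggers; it is there only to make the positivity requirement explicit.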

Moreover, we observe that the energy per bit consumed
is constant for both Opt and serial schemes when considering
heterogeneous network conditions. This occurs because
none of the considered schemes takes into account host
upload/download capacity in determining the schedule for file
distribution.

Finally, note that confidence intervals have been obtained 
for the different curves and all of them present less than 5% 
difference to the average value in the figure. 

V. Related work 



Energy-Efficiency in Networks: In order to reduce the overall 
energy consumption of the Internet, many dimensions for 
energy savings have been explored. The main efforts include 
turning off the devices that are unnecessarily on pO) , 
aggregating traffic streams to send data in bulk pO), ||18), 



network planning |20|, energy efficient routing ||8)7|0 and 
virtualization and migration of routers [21]. Furthermore, some
works have addressed specific aspects of energy-efficiency in
datacenters [5], [22], [23].

Optimization problems in file-distribution processes: A
significant amount of effort has been dedicated to studying the
download completion time in a file distribution process [24]-[26].
The minimization of the average finish time in P2P
networks is considered in [27]-[29]. Of interest to this paper,
[30] presents a theoretical study to derive the minimum time
associated with a P2P file distribution process. However, a
scheme guaranteeing a file distribution with minimum time
does not generally minimize the energy consumption.
Moreover, schemes with similar distribution time may have
different energy costs.



Energy-Efficiency in file distribution: To the best of the
authors' knowledge, energy consumption in file distribution
processes has received little attention so far. On the one hand,
practical studies Q, [31]-[34] have discussed and compared
the energy consumed by different content distribution
architectures or protocols. However, none of them relies on an
analytical basis nor aims to design optimal algorithms, as
is the case of our paper. On the other hand, Mehyar et al.
[35] and Sucevic et al. [36] (as we do) address
energy-efficiency in file distribution from an analytical
point of view. However, their studies are restricted to P2P
schemes, whereas the current paper covers both centralized and
distributed approaches in order to identify the most efficient
scheme. In addition, their analysis is limited to networks of at
most 3 nodes. For bigger network sizes, they provide heuristics
and use simulations to evaluate energy efficiency. Instead, our
analysis is valid for an arbitrary number of nodes. Finally, it is
worth mentioning that, to the best of our knowledge, we are the
first to provide a proof of the NP-hardness of the
energy-efficiency optimization problem for file-distribution processes.

VI. Conclusions 

This paper presents one of the first studies of a novel
and relevant field that has received little attention so far:
energy-efficiency in file distribution processes. We present
a theoretical framework that constitutes the analytical basis
for the design of energy-efficient file distribution protocols.
Specifically, this framework reveals two important observations:
(i) the general problem of minimizing the energy
consumption in a file distribution process is NP-hard, and (ii)
in all the studied scenarios there always exists a collaborative
(i.e., p2p-like) distributed algorithm that consumes less energy
than any centralized counterpart. This suggests
that in those file distribution processes in which reducing the
energy consumption is of significant importance (e.g., overnight
software updates in a corporate network) a distributed
algorithm should be implemented.

Appendix

A. NP-hardness 

We show in this section that a general version of the problem
considered in this paper is NP-hard. The following theorem
summarizes the result.

Theorem 7: Assume that time is slotted, that hosts must
upload at their full capacity, and that no host can upload
to more than one host in the same slot. The problem of
minimizing the energy of file distribution is NP-hard if hosts
can have different upload capacities and power consumptions,
even if $\alpha_i = \delta_i = 0, \forall i$.

Proof: We use a reduction from the partition problem. The
input of this problem is a set of integers (we assume all of
them to be positive) $A = \{x_0, x_1, \ldots, x_{k-1}\}$, $k > 1$. Let
$M = \sum_{x_i \in A} x_i$, which we assume to be even. The problem is to
decide whether there is a subset $A' \subset A$ such that
$\sum_{x_i \in A'} x_i = M/2$.

We reduce an instance of the partition problem to an
instance of our problem as follows. The file to distribute has
$M$ blocks of size 1. There are $n = k + 3$ hosts: server $S$, hosts
$T$ and $R$, and hosts $H_i$, for $i \in [0, k-1]$. All hosts have fixed
setup energy $\delta_i = 0$ and no cost for switching on and off, i.e.,
$\alpha_i = 0$. Server $S$ has upload capacity $M$ and power $P$. Host
$T$ has download and upload capacity $M$, and power $P$. Hosts
$H_i$, $i \in [0, k-1]$, have download capacity $M$, upload capacity
$u_i = x_i$, and power consumption $P$. Host $R$ has download
capacity $M/2$ and power consumption $P' > 2P(2k+1)$. The
slot length is one unit of time.

Observe that there is always a feasible solution that respects
the assumptions of the model. It works as follows. First, $S$
serves the whole file to $T$ in one slot. Then, $T$ serves the
whole file to hosts $H_i$, $i \in [0, k-1]$, in consecutive slots.
Finally, each host $H_i$, $i \in [0, k-1]$, serves $x_i$ different blocks
to $R$ in consecutive slots.

We claim that the subset $A'$ that satisfies $\sum_{x_i \in A'} x_i = M/2$
exists if and only if the file distribution problem can be solved
with energy smaller than $3P'$. Hence, the energy minimization
problem is NP-hard.

If subset $A'$ exists, the following schedule is feasible. First,
$S$ serves $T$ the whole file in one slot. Then, $T$ serves each
host $H_i$, $i \in [0, k-1]$, the whole file in consecutive slots. Let
$U = \bigcup_{x_i \in A'} \{H_i\}$; then the hosts in $U$ upload the file to $R$ in
two slots, half the file in each slot. The total energy consumed
is

\[ E = 2P + 2Pk + 2(|A'|P + P') < 2P(2k+1) + 2P' < 3P'. \]

Assume now that there is a schedule with energy less than
$3P'$. Then, $R$ has been up for exactly two slots: fewer than three
because each slot in which $R$ is up costs at least $P'$, and at least
two because its download capacity is $M/2$. Since they cannot upload
at full capacity to $R$, and they cannot serve more than one
host, neither $S$ nor $T$ can serve $R$. Then, looking at the first
slot in which $R$ is up, $R$ must have been served by a subset
of hosts $H_i$ whose aggregate upload capacity is exactly $M/2$.
This proves the existence of $A'$. ■
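The reduction above can be made concrete with a small sketch. It builds the host set of the file-distribution instance from a partition input, searches for a half-sum subset with a standard subset-sum table, and evaluates the energy of the schedule used in the "if" direction of the proof. All function names and the dictionary layout are hypothetical, the choice $P' = 2P(2k+1) + 1$ is just one value satisfying the strict inequality, and capacities the proof does not specify are left as `None`.

```python
def build_reduction_instance(xs, P=1.0):
    """Build the file-distribution instance for a partition input xs (positive ints)."""
    k = len(xs)
    M = sum(xs)
    P_prime = 2 * P * (2 * k + 1) + 1.0  # any value strictly above 2P(2k+1) works
    hosts = {
        "S": {"up": M, "down": None, "power": P},       # download capacity not specified
        "T": {"up": M, "down": M, "power": P},
        "R": {"up": None, "down": M / 2, "power": P_prime},  # upload capacity not specified
    }
    for i, x in enumerate(xs):
        hosts[f"H{i}"] = {"up": x, "down": M, "power": P}
    return {"blocks": M, "hosts": hosts, "P": P, "P_prime": P_prime}


def find_half_subset(xs):
    """Return indices of a subset summing to sum(xs)/2, or None if none exists."""
    total = sum(xs)
    if total % 2:
        return None
    target = total // 2
    reach = {0: []}                      # reachable sum -> one index list achieving it
    for idx, x in enumerate(xs):
        for s, subset in list(reach.items()):
            t = s + x
            if t <= target and t not in reach:
                reach[t] = subset + [idx]
    return reach.get(target)


def schedule_energy(k, subset_size, P, P_prime):
    """Energy of the schedule in the proof: E = 2P + 2Pk + 2(|A'|P + P')."""
    return 2 * P + 2 * P * k + 2 * (subset_size * P + P_prime)
```

For the input `[3, 1, 1, 2, 2, 1]` a half-sum subset exists, and the resulting schedule energy stays below `3 * P_prime`, as the proof requires. The subset search is pseudo-polynomial and is only meant for small illustrative inputs.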



B. Proof of Theorem [7] 

We transform the cost of a block as defined in Equation [3] 
to the following one. For each host i E I^, define 4>i and 4'i 
as 



from Lemma 2 and Corollaries 1 and 2,

\[ \sum_{i=0}^{n-1} \sum_{j=0}^{\beta-1} \Delta_{serv(i,j)} \, U_{i,j} \ge \beta \cdot \Delta_S + \max\{0, n-\beta\} \cdot \min\{\Delta_S, \Delta_0\} \quad (11) \]



Note that ^, 5. 2?; 
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Otherwise 



Adding Equations 10 and 11, the claim follows.



1 iff |5f^| > 1 (i.e., when 

= A, 

i.e., , ■ ■•, =0. Therefore, for a host i £ Xi, either 
Ai or 7/;^ = A;, never both or both A^. Hence, 



1,1 

Ai). It is easy to see that W^ ^ = 1 iff V'serDQ-.j) = ^sert,(i,j) 



D. Proofs of Correctness and Optimality for k = 1

For the correctness and optimality proofs of a scheme $z$
(described by an algorithm), we define the state $\sigma_i^{z,\tau}$ of a host
$i \in I$ at the end of slot $\tau$ as the set of blocks held by that
time at the host. Thus, to start with, initially for $S$ we have
$\sigma_S^{0} = B$ and, for each client $i \in \{0, \ldots, n-1\}$, $\sigma_i^{0} = \emptyset$.



i4>i + i'i) 



A,; 



C. Proof of Theorem^(Lower Bound for k = 1) 

The claim to be shown is that if $k = 1$ any scheme $z$ consumes
energy

\[ E(z) \ge \beta \Big( \Delta_S + \sum_{i=0}^{n-1} \Delta_i \Big) + \max\{0, n-\beta\} \cdot \min\{\Delta_S, \Delta_0\}. \]

Before proving the claim, we need some supporting claims. 
Lemma 1: For every block hj and every client Hi it holds 
that X>f ,, = 1. 

Proof: Since d — u, each host can receive only one block 
in a time slot. Hence, if block bj is transferred to client Hi in 
slot T, we have |5f^| — 1. Then, by definition, j = 1. ■ 

Lemma 2: For every block hj served by S to client Hi, it 
holds = 1. 

Proof Let S be serving hj to Hi in slot r. Then, Sg , 
is always 0, because the server never receives any block from 
the clients, which means that Uji — 1 for any block hj served ai 
by S. ■ 

Since S has to serve each block of the file at least once, we 
obtain the following corollary. 

Corollary 1: For at least /3 block transfers Uji — \. 

Lemma 3: If there exists a host H that is receiving its first 
block in a time slot t, then there is at least one block bj in t 
such that Ufi = I. 

Proof The number of active hosts in slot r is |If |. At 
most — 1 blocks can be transferred in r because host H 
cannot upload to anyone. Then, since d = u, there exists at 
least one host H' that is on only for uploading. Let bj be 
the block served by H'. As it is not downloading any block, 
= and hence = 1. ■ 

Corollary 2: There are n hosts that receive a block for the 



If $z$ is correct, after the makespan of $z$ ($\tau_z$ slots) the state of
every client $i \in \{0, \ldots, n-1\}$ must be $\sigma_i^{z,\tau_z} = B$. We omit $z$
and $\tau$ when clear from the context.

1) Algorithm 1: Let us denote the scheme described by
Algorithm 1 as $z_1$. This scheme has the following properties.

Observation 1: After the for loop at Lines 1-5 the state of
client $i$ is $\sigma_i = \{b_i\}$, $\forall i \in \{0, \ldots, n-1\}$.

Lemma 4: After the $q$-th iteration of the loop at Lines 6-12,
for $q \in \{0, \ldots, n-1\}$, each host $H_i$, $i \in \{0, \ldots, n-1\}$, has
state

\[ \sigma_i = \bigcup_{p=0}^{q} \{ b_{(i+p) \bmod n} \} \quad (12) \]

Proof: We prove the claim by induction on $q$. The base
case ($q = 0$) holds from the observation: after the for loop at
Lines 1-5, $\sigma_i = \{b_i\}$. Assuming the hypothesis to be true for
$q - 1$, in the $q$-th iteration $H_i$ receives block $b_{(i+j+1) \bmod n}$.
In this iteration, the value of $j$ is $j = q - 1$. Hence, $H_i$ receives
$b_{(i+q) \bmod n}$ and the state after the $q$-th iteration is

\[ \sigma_i = \bigcup_{p=0}^{q-1} \{ b_{(i+p) \bmod n} \} \cup \{ b_{(i+q) \bmod n} \} = \bigcup_{p=0}^{q} \{ b_{(i+p) \bmod n} \} \quad (13) \]



first time. Thus, for at least n block transfers U 



1. 



We now prove the claim. In order to compute the minimum
energy consumption, we need to lower bound Equation 5.
From Lemma 1 it follows that

n-l/3-l n-1 



i=0 j=0 



(10) 



i=0 



Lemma 5: In every iteration of the for loop at Lines 6-12,
host $H_i$, $i \in \{0, \ldots, n-1\}$, serves one of the blocks it has
already downloaded.

Proof: In the $q$-th iteration, $q \ge 1$, the value of $j$ is $q - 1$, so
$H_i$ serves block $b_{(i+j) \bmod n} = b_{(i+q-1) \bmod n}$. From the
previous lemma, after the $(q-1)$-th iteration, the state of $i$ is

\[ \sigma_i = \bigcup_{p=0}^{q-1} \{ b_{(i+p) \bmod n} \} \quad (14) \]

which includes $b_{(i+q-1) \bmod n}$. Hence the claim follows. ■
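The block rotation analysed in Lemmas 4 and 5 can be checked with a short simulation. The sketch below is a reading of the proofs, not a transcription of Algorithm 1: it assumes that in the first loop the server hands block $b_i$ to $H_i$ in a slot of its own, and that in iteration $q$ host $H_{(i+1) \bmod n}$ is the one that sends $b_{(i+q) \bmod n}$ to $H_i$ (which is consistent with Lemma 5, but is our inference). The per-slot energies `delta_server` and `delta_hosts` stand for $\Delta_S$ and $\Delta_i$.

```python
def simulate_rotation(n, delta_server, delta_hosts):
    """Simulate the rotation scheme for beta = n blocks and n clients.

    Returns (states, energy), where states[i] is the set of block indices
    held by client H_i at the end and energy is the total energy consumed.
    """
    states = [set() for _ in range(n)]
    energy = 0.0

    # First loop (assumed): S serves b_i to H_i, one transfer per slot.
    for i in range(n):
        states[i].add(i)
        energy += delta_server + delta_hosts[i]

    # Rotation loop: n - 1 iterations, all clients active in each one.
    for q in range(1, n):
        transfers = []
        for i in range(n):
            block = (i + q) % n
            sender = (i + 1) % n          # assumed sender, see lead-in
            assert block in states[sender], "sender must already hold the block"
            transfers.append((i, block))
        for i, block in transfers:        # apply all transfers of the slot at once
            states[i].add(block)
        energy += sum(delta_hosts)

    return states, energy
```

Under these assumptions every client ends with all n blocks, and the energy equals $n(\Delta_S + \sum_i \Delta_i)$, the value stated for Algorithm 1.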
Theorem 8: After the termination of Algorithm 1 each
client $H_i$, $i \in \{0, \ldots, n-1\}$, has received all the blocks $b_j \in B$
with optimal energy $E(z_1) = n(\Delta_S + \sum_{i=0}^{n-1} \Delta_i)$.

Proof: It follows from Lemma 4 that after the $(n-1)$-th
iteration of the loop at Lines 6-12 each host has received
all the blocks. The scheme is then correct, since each host
serves a block it has already downloaded (Lemma 5). Each
host (including the server) is active exactly $n$ slots. Then, the
total energy consumed is $E(z_1) = n(\Delta_S + \sum_{i=0}^{n-1} \Delta_i)$, which
is optimal since it matches the lower bound. ■

2) Algorithm 2: Let us denote the scheme described by
Algorithm 2 as $z_2$. This scheme has the following properties.

Observation 2: After the for loop at Lines 1-5 the state of
client $i$ is $\sigma_i = \{b_i\}$, $\forall i \in \{0, \ldots, n-1\}$.

Lemma 6: After the $q$-th iteration of the loop at Lines 6-13
for $q \in \{0, 1, \ldots, \beta-n\}$, each host $H_i$, $i \in \{0, \ldots, n-1\}$, has
state

\[ \sigma_i = \bigcup_{p=0}^{q} \{ b_{i+p} \} \quad (15) \]

Proof: We use induction on $q$ to prove the lemma. The
base case ($q = 0$) follows from the observation.

Induction step: Assume the hypothesis to be true for the
$(q-1)$-th iteration. Client $H_i$, $i \in \{0, \ldots, n-2\}$, receives block
$b_{i+q}$ in the $q$-th iteration, while client $H_{n-1}$ receives block
$b_{q+n-1}$ from the server. Thus, $\forall i \in \{0, \ldots, n-1\}$, the state
of client $H_i$ after the $q$-th iteration is

\[ \sigma_i = \bigcup_{p=0}^{q-1} \{ b_{i+p} \} \cup \{ b_{i+q} \} = \bigcup_{p=0}^{q} \{ b_{i+p} \} \]

Lemma 7: After the $q'$-th iteration of the loop at Lines
14-20 for $q' \in \{0, 1, \ldots, n-1\}$, each host $H_i$, $i \in \{0, 1, \ldots, n-1\}$,
has state

\[ \sigma_i = \bigcup_{p=0}^{q'+\beta-n} \{ b_{(i+p) \bmod \beta} \} \quad (16) \]

Proof: We use induction on $q'$ to prove the claim. The
base case ($q' = 0$) follows from Lemma 6 with $q = \beta - n$.
Let the claim (induction hypothesis) be true for the $(q'-1)$-th
iteration. In the $q'$-th iteration, the value of $j$ is $j = q' + \beta - 1$.
Hence, $H_i$ receives block $b_{(i+q'+\beta-n) \bmod \beta}$. Thus, the state of client
$H_i$ after the $q'$-th iteration is

\[ \sigma_i = \bigcup_{p=0}^{q'+\beta-n-1} \{ b_{(i+p) \bmod \beta} \} \cup \{ b_{(i+q'+\beta-n) \bmod \beta} \} = \bigcup_{p=0}^{q'+\beta-n} \{ b_{(i+p) \bmod \beta} \} \quad (17) \]

■

Lemma 8: During the execution of Algorithm 2 each host
$H_i$, $i \in \{0, \ldots, n-1\}$, serves a block that it has already
downloaded.

Proof: Let us consider the loops at Lines 6-13 and Lines
14-20 in sequence. In the $q$-th iteration of these loops, host $H_i$
serves block $b_{(i+q-1) \bmod \beta}$. From the previous lemmas, after
the $(q-1)$-th iteration of these loops, host $H_i$ has state

\[ \sigma_i = \bigcup_{p=0}^{q-1} \{ b_{(i+p) \bmod \beta} \} \]

which includes $b_{(i+q-1) \bmod \beta}$. Hence the claim follows. ■

Theorem 9: After the termination of Algorithm 2 each host
$H_i$, $i \in \{0, \ldots, n-1\}$, has received all the blocks $b_j \in B$ with
optimal energy $E(z_2) = \beta(\Delta_S + \sum_{i=0}^{n-1} \Delta_i)$.

Proof: It follows from Lemma 7 that each host has
received all the blocks at the end of the loop at Lines 14-20.
Then, the scheme is correct since each host serves a block that
it has already downloaded (Lemma 8). Each host (including
the server) is active exactly $\beta$ slots. Then, the total energy
consumed is $E(z_2) = \beta(\Delta_S + \sum_{i=0}^{n-1} \Delta_i)$, which is optimal
since it matches the lower bound. ■
3) Algorithm 3: For the correctness and optimality proofs
of Algorithm 3 we define the state $C_r$ of a block $b_r$ at the
end of slot $\tau$ as the set of clients $H_i$, $i \in \{0, \ldots, n-1\}$, who have
received $b_r$. Thus, to start with, $\forall r \in \{0, \ldots, \beta-1\}$, initially the
state of block $b_r$ is $C_r = \emptyset$. After the makespan $\tau_z$ of scheme
$z$, the state should be, $\forall r \in \{0, \ldots, \beta-1\}$, $C_r = \bigcup_{i=0}^{n-1} \{H_i\}$.
Let us denote the scheme described by Algorithm 3 as $z_3$.
This scheme has the following properties.

Observation 3: After the for loop at Lines 1-5, $\forall r \in
\{0, 1, \ldots, \beta-1\}$, the state of block $b_r$ is $C_r = \{H_r\}$.

Lemma 9: After the $q$-th iteration of the for loop at Lines
6-13 for $q \in \{0, \ldots, n-\beta\}$, the state of block $b_r$ is

\[ C_r = \bigcup_{p=0}^{q} \{ H_{r+p} \} \quad (18) \]

Proof: We prove the claim using induction on $q$. The base
case ($q = 0$) is trivially true by the observation. Assume the
statement to be true for the $(q-1)$-th iteration. In the $q$-th
iteration, $q = j + 1 - \beta$. Then, block $b_r$ is served to $H_{r+q}$.
Thus, the state of block $b_r$ after the $q$-th iteration is

\[ C_r = \bigcup_{p=0}^{q-1} \{ H_{r+p} \} \cup \{ H_{r+q} \} = \bigcup_{p=0}^{q} \{ H_{r+p} \} \]

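Lemma 10: After the $q'$-th iteration of the for loop at Lines
14-21 for $q' \in \{0, 1, \ldots, \beta-1\}$, the state of block $b_r$ is

\[ C_r = \bigcup_{p=0}^{n-\beta} \{ H_{r+p} \} \cup \bigcup_{p=0}^{q'} \{ H_{(r-p) \bmod n} \} \quad (19) \]

Proof: The base case ($q' = 0$) is true from Lemma 9 after
the loop at Lines 6-13 completes. In iteration $q' = j + 1 - n$,
block $b_{\beta-1}$ is served to $H_{\beta-q'-1}$, hence,

\[ C_{\beta-1} = \bigcup_{p=0}^{n-\beta} \{ H_{\beta+p-1} \} \cup \bigcup_{p=0}^{q'-1} \{ H_{\beta-1-p} \} \cup \{ H_{\beta-q'-1} \} \]

and block $b_r$, $r \in \{0, 1, \ldots, \beta-2\}$, is served to $H_{(r-q') \bmod n}$.
Then, the state of block $b_r$, $r \in \{0, \ldots, \beta-1\}$, after the $q'$-th
iteration is

\[ C_r = \bigcup_{p=0}^{n-\beta} \{ H_{r+p} \} \cup \bigcup_{p=0}^{q'-1} \{ H_{(r-p) \bmod n} \} \cup \{ H_{(r-q') \bmod n} \} = \bigcup_{p=0}^{n-\beta} \{ H_{r+p} \} \cup \bigcup_{p=0}^{q'} \{ H_{(r-p) \bmod n} \} \]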

Lemma 11: During the execution of Algorithm 3 each host
$H_i$, $i \in \{0, 1, \ldots, n-1\}$, serves a block that it has already
downloaded.

Proof: In the for loop at Lines 6-13 during iteration $q =
j + 1 - \beta$, $q \in \{1, \ldots, n-\beta\}$, block $b_r$ is served by $H_{r+q-1}$.
It has it because after iteration $q - 1$,

\[ C_r = \bigcup_{p=0}^{q-1} \{ H_{r+p} \} \]

which includes $H_{r+q-1}$. $H_0$ always serves if any, which
it has from the above observation.

In the for loop at Lines 14-21 during iteration $q' = j + 1 - n$,
$q' \in \{1, \ldots, \beta-1\}$, block $b_{\beta-1}$ is served by $H_{n-q'}$. It has it
because after iteration $q' - 1$,

\[ C_{\beta-1} = \bigcup_{p=0}^{n-\beta} \{ H_{\beta-1+p} \} \cup \bigcup_{p=0}^{q'-1} \{ H_{\beta-1-p} \} \]

which includes $H_{n-q'}$, $q' \in \{1, 2, \ldots, \beta-1\}$.

Block $b_r$, $r \in \{0, 1, \ldots, \beta-2\}$, is served by $H_{(r-(q'-1)) \bmod n}$.
It has it because after iteration $q' - 1$

\[ C_r = \bigcup_{p=0}^{n-\beta} \{ H_{r+p} \} \cup \bigcup_{p=0}^{q'-1} \{ H_{(r-p) \bmod n} \} \]

which includes $H_{(r-(q'-1)) \bmod n}$. Hence, the claim follows.



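Theorem 10: After the termination of Algorithm 3 each
host $H_i$, $i \in \{0, \ldots, n-1\}$, has received all the blocks $b_r \in B$
with optimal energy $E(z_3) = \beta(\Delta_S + \sum_{i=0}^{n-1} \Delta_i) + (n-\beta) \min\{\Delta_S, \Delta_0\}$.

Proof: It follows from Lemma 10 that each host has
received all the blocks. Then, the scheme is correct since each
host serves blocks it has already downloaded (Lemma 11).
We need to bound now the energy consumed. Let us denote
$\Delta_{min} = \min\{\Delta_S, \Delta_0\}$. The energy consumed in the loop at
Lines 1-5 is easily observed to be

\[ E_1 = \beta \Delta_S + \sum_{i=0}^{\beta-1} \Delta_i \quad (20) \]

The energy consumed in the loop at Lines 6-13 is

\[ E_2 = \sum_{j=\beta}^{n-1} \Big( \Delta_{min} + \Delta_{j+1-\beta} + \sum_{i=1}^{\beta-1} \Delta_{i+j+1-\beta} \Big) = (n-\beta)\Delta_{min} + \sum_{j=\beta}^{n-1} \sum_{i=0}^{\beta-1} \Delta_{i+j+1-\beta} \quad (21) \]

Finally, the energy consumed in the loop at Lines 14-21 is

\[ E_3 = \sum_{j=n}^{n+\beta-2} \Big( \Delta_{n+\beta-j-2} + \sum_{i=0}^{\beta-2} \Delta_{(n+i-j-1) \bmod n} \Big) = \sum_{j=n}^{n+\beta-2} \sum_{i=0}^{\beta-1} \Delta_{(n+i-j-1) \bmod n} = \sum_{j=0}^{\beta-2} \sum_{i=0}^{\beta-1} \Delta_{(i-j-1) \bmod n} \quad (22) \]

Adding Equations 20, 21 and 22 we get

\[ E(z_3) = E_1 + E_2 + E_3 = \beta \Delta_S + (n-\beta)\Delta_{min} + \sum_{i=0}^{\beta-1} \Delta_i + \sum_{j=0}^{n-\beta-1} \sum_{i=0}^{\beta-1} \Delta_{i+j+1} + \sum_{j=0}^{\beta-2} \sum_{i=0}^{\beta-1} \Delta_{(i-j-1) \bmod n} \]

\[ = \beta \Delta_S + (n-\beta)\Delta_{min} + \sum_{i=0}^{\beta-1} \Big( \Delta_i + \sum_{j=i+1}^{i+n-\beta} \Delta_j + \sum_{j=0}^{i-1} \Delta_j + \sum_{j=i+n-\beta+1}^{n-1} \Delta_j \Big) \]

\[ = \beta \Delta_S + (n-\beta)\Delta_{min} + \sum_{i=0}^{\beta-1} \sum_{j=0}^{n-1} \Delta_j \]

which is optimal. ■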
E. Proof of Theorem 4

From Theorems 2 and 3 the energy consumption of an
optimal scheme $z$ in an energy homogeneous system is

\[ E(z) = (n\beta + \max\{n, \beta\}) \cdot \Big( \frac{PB}{u\beta} + \delta \Big) \quad (23) \]

To find the optimal value of $\beta$, we need to minimize the right
hand side of Equation 23. This can be written as a function
of $\beta$ as

\[ f(\beta) = \begin{cases} (n+1)\frac{PB}{u} + \delta(n+1)\beta, & \beta \ge n \quad (24) \\ \frac{nPB}{u} + \frac{nPB}{u\beta} + n\delta\beta + n\delta, & \beta < n \quad (25) \end{cases} \]

Note that in Equation 24 the first term is a constant and the
second is linear in $\beta$. This is a straight line with positive slope
$\delta(n+1)$. Hence, the function attains the minimum at the lower
extreme $\beta = n$, where it intersects Equation 25. Hence it
is enough to consider Equation 25 for $\beta \le n$. Minimizing
Equation 25 with respect to $\beta$ we get

\[ \beta = \sqrt{\frac{PB}{u\delta}} \quad (26) \]

When this value is larger than $n$ the value $\beta = n$ has to be
used.
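The choice of the number of blocks can be illustrated numerically. The sketch below reads Equation 23 as $E(z) = (n\beta + \max\{n, \beta\})(PB/(u\beta) + \delta)$ and Equation 26 as $\beta = \sqrt{PB/(u\delta)}$, capped at $n$; both readings are reconstructions from the surrounding derivation. The function names are hypothetical, and rounding to an integer by comparing the floor and the ceiling is an addition of this example, justified by the convexity of Equation 25 in $\beta$.

```python
import math


def energy(beta, n, P, B, u, delta):
    """Energy of an optimal scheme with beta blocks (Equation 23, as read above)."""
    return (n * beta + max(n, beta)) * (P * B / (u * beta) + delta)


def best_integer_blocks(n, P, B, u, delta):
    """Integer number of blocks minimizing the energy.

    The continuous minimizer sqrt(P*B/(u*delta)) is capped at n, then the
    better of its floor and ceiling (kept within [1, n]) is returned.
    """
    beta_star = min(math.sqrt(P * B / (u * delta)), n)
    low = max(1, math.floor(beta_star))
    high = min(n, max(1, math.ceil(beta_star)))
    return min({low, high}, key=lambda b: energy(b, n, P, B, u, delta))
```

As a usage example with assumed values, `best_integer_blocks(50, 80.0, 8e6, 1e6, 1.0)` evaluates the candidates 25 and 26 around $\sqrt{640} \approx 25.3$, while a much smaller `delta` pushes the unconstrained optimum above `n`, so `n` blocks are used.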



F. Proof of Theorem 5

Proof: It can be easily observed that every slot in which
a host receives its first block is a tree slot (since it does not
serve anyone). Additionally, no two clients can receive their
first block in the same slot in a normal scheme. Then, there
are at least $n$ tree slots.

According to Definition 4, the cost of a block can only
take values $0$, $\Delta$ or $2\Delta$. Let us consider a slot $\tau$, and let
$|I^\tau|$ be the number of hosts active in $\tau$. We denote
with #0, #1, and #2 the number of blocks whose cost is
$0$, $\Delta$, and $2\Delta$ in $\tau$, respectively. Then, we can prove that
if $\tau$ is a tree slot, then #2 = #0 + 1, while if $\tau$ is a slot
with a cycle, then #2 = #0. The proof of this claim goes
as follows. From Theorem 1, the costs of all blocks in $\tau$ add
up to the cost of $\tau$. Since all hosts have the same $\Delta$, then
$0 \cdot \#0 + 1 \cdot \#1 + 2 \cdot \#2 = |I^\tau|$. In a tree slot the number of
blocks served is $\#0 + \#1 + \#2 = |I^\tau| - 1$, while in a slot with
a cycle the number of blocks served is $\#0 + \#1 + \#2 = |I^\tau|$.
Subtracting the number of blocks served from the first equality
gives $\#2 - \#0 = 1$ for a tree slot and $\#2 - \#0 = 0$ for a slot with
a cycle. Hence the claim follows.

This implies that, if $x$ blocks are served in slot $\tau$, the cost
of $\tau$ is $x\Delta$ if $\tau$ is a slot with a cycle, and $(x+1)\Delta$
if $\tau$ is a tree slot. Since the total number of blocks served is
$n\beta$ and there are at least $n$ tree slots, the bound follows. ■

G. Proofs of Algorithm 4

The proof of correctness of Algorithm 4 can be divided
in essentially four parts. (We use the array abstraction for
clarity.) The first claim is that, after the first loop (Lines 2-6),
the diagonal of the first subarray has been filled (i.e.,
$A_{i,i} = 1, \forall i \in \{0, \ldots, n-1\}$). This claim follows trivially by
inspection. The second claim is that after the second loop (Lines
7-11), the top left corner position of each subarray has also
been set to 1 (i.e., $A_{0,j} = 1, \forall j \in \{0, n, 2n, \ldots, (\lfloor \beta/n \rfloor - 1)n\}$).
This claim also follows by inspection.

The third claim is that, after the $q$-th iteration of the third
loop (Lines 12-21), the whole $q$-th subarray and the diagonal of
the $(q+1)$-th subarray have been set to 1 (and the blocks served
by a host were available at the host for being served). This can
be shown by induction on $q$, where the base case is the first
claim above. In the induction step, the proof that the whole
$q$-th subarray is set to 1 is similar to the proof of Algorithm
1. The proof that the diagonal of the $(q+1)$-th subarray is
set follows from the second claim above and Line 15 of the
algorithm.

Finally, the fourth claim is that the process described in
Line 22 completes the array. The proof of this claim is very
similar to the proof of Algorithm 2.

Let us now compute the energy consumed by the scheme
described by the algorithm. The first loop consumes energy
$E_1 = 2n\Delta$. The second loop consumes $E_2 = 2(\lfloor \beta/n \rfloor - 1)\Delta$.
The third loop uses energy



Line E2 is 




A{b{n+l)+n{n~l)). 



Adding up all these terms 

E{zi) = A ( n(/3 4 



E.^A 



E 

/=0 j=0 



1) = A(L^J-I)(n2-1) 
n 



Finally, the energy consumed by the process described in 



